Genomic epidemiology of Salmonella Typhi in Central Division, Fiji, 2012 to 2016

Summary
Background Typhoid fever is endemic in some Pacific Island Countries, including Fiji and Samoa, yet genomic surveillance is not routine in such settings. Previous studies suggested that imports of the global H58 clade of Salmonella enterica serovar Typhi (Salmonella Typhi) contribute to disease in these countries, which, given the multi-drug resistance (MDR) potential of H58, does not augur well for treatment. The objective of the study was to define the genomic epidemiology of Salmonella Typhi in Fiji.
Methods Genomic sequencing approaches were implemented to study the distribution of 255 Salmonella Typhi isolates from the Central Division of Fiji. We augmented epidemiological surveillance and Bayesian phylogenomic approaches with a multi-year typhoid case-control study to define geospatial patterns among typhoid cases.
Findings Genomic analyses showed that Salmonella Typhi from Fiji resolved into 2 non-H58 genotypes, with isolates from the two dominant ethnic groups, the Indigenous (iTaukei) and non-iTaukei, genetically indistinguishable. Low rates of international importation of clones were observed and, overall, there were very low levels of antibiotic resistance within the endemic Fijian typhoid genotypes. Genomic epidemiological investigations were able to identify previously unlinked case clusters. Bayesian phylodynamic analyses suggested that genomic variation within the larger endemic Salmonella Typhi genotype expanded at discrete times, then contracted.
Interpretation Cyclones and flooding drove ‘waves’ of typhoid outbreaks in Fiji which, through population aggregation, poor sanitation and water safety, and then mobility of the population, spread clones more widely. Minimal international importations of new typhoid clones suggest that targeted local intervention strategies may be useful in controlling endemic typhoid infection. These findings add to our understanding of typhoid transmission networks in an endemic island country, with broad implications, particularly across Pacific Island Countries.
Funding This work was supported by the Coalition Against Typhoid through the Bill and Melinda Gates Foundation [grant number OPP1017518], the Victorian Government, the National Health and Medical Research Council Australia, the Australian Research Council, and the Fiji Ministry of Health and Medical Services.

Evidence before the study
Typhoid fever, caused by the human pathogen Salmonella Typhi, is endemic in some Pacific Island Countries including Fiji. There is a lack of understanding of the population structure of Salmonella Typhi in Pacific Island Countries. We searched PubMed (manuscripts prior to September 2021) with no language restrictions using the terms "typhoid" and "Fiji" and "genomics". The one resulting study examined the global dissemination of the single Salmonella Typhi clone, termed H58, throughout the world, which included sporadic samples from Fiji and neighbouring countries. A second Salmonella Typhi genotyping paper from our extended research team also included sporadic genome samples from Fiji, yet these were restricted to ad hoc collections, often travellers, that lacked allied metadata. No study has undertaken a detailed longitudinal genomic epidemiology assessment to characterise the evolutionary dynamics of typhoid clones in a Pacific Island Country such as Fiji.

Added value of this study
In this retrospective genomic epidemiology study, we present a spatiotemporal analysis of typhoid cases from the Central Division of Fiji from 2012 to 2016. We identify that 99% of typhoid cases in Fiji are driven by two closely related Salmonella Typhi genotypes (genotypes 4.2.1 and 4.2.2), with extended regional analyses indicating that Pacific Island Countries appear to have their own genetically distinct typhoid genotypes. Through coupling of genomics and epidemiology, we define genome clusters and track their spread associated with documented public health outbreaks. Phylogenomic analyses revealed that 'waves' of typhoid cases were associated with population displacement through major events such as cyclones.

Implications of all the available evidence
Our study highlights that typhoid in Fiji is an endemic situation driven through evolution and transmission of regional Salmonella Typhi clones, rather than driven through frequent importation of new clonal variants.

Introduction
The Republic of Fiji, located in the tropical south Pacific, consists of 332 islands and has approximately 890,000 people. 1 The largest island, Viti Levu, is volcanic in origin and the eastern slopes receive approximately 3 metres of rain per year. The tholeiitic basalt geology of Viti Levu militates against improved sanitation, and typhoid control will depend on better understanding of transmission, improved sampling, increased diagnosis, contact tracing, and, ultimately, vaccination. The origins of typhoid fever in Fiji are unclear, but the reported incidence increased significantly in the mid-2000s and the trend for elevated numbers of culture-confirmed cases has continued. [2][3][4][5] A seroepidemiology study in 2014 suggested that 32.2% of the Fijian population have detectable and often high levels of anti-Vi antibodies, typically a marker of recurrent typhoid infections. 6 Against this background of endemicity, focal outbreaks occur in association with major tropical storms and cyclones, which are increasingly attributed to climate change and the warming of nearby oceans. 7 Transmission remains somewhat enigmatic. The causative bacterium for typhoid fever, Salmonella Typhi, is considered monophyletic, [8][9][10][11] is human host-restricted, is difficult to isolate from the environment, and is generally transmitted faecally-orally through contaminated food or water. 12,13 Some individuals infected with Salmonella Typhi become asymptomatic carriers, often harbouring the bacterium in their gallbladder, and intermittently transmit the pathogen through contaminated faeces.
Our study used bacterial genomics to address two important questions that will impact on the control of typhoid fever in Fiji. First, is the pathogen related to the dominant multi-drug resistant (MDR) global H58 clade of Salmonella Typhi 11,14 and are there ongoing importations? Second, can genomics-based micro-epidemiology be used to help resolve the networks of transmission of typhoid fever in Fiji? The results will focus and enhance surveillance of typhoid fever in Fiji, and in the region, as a prelude to the introduction of the new conjugate typhoid vaccine. 15
16 which ran from 27 January 2014 through 31 January 2017 were encompassed within this genomic epidemiology study. 240 isolates were culture-confirmed blood isolates. Ten stool culture isolates from public health responses to localised typhoid outbreaks were also included. Isolates were coupled with allied epidemiological metadata where available, including known outbreak cases investigated by the Fiji Centre for Disease Control and/or in conjunction with the recent case-control study. 16 The households of 129 cases (135 samples, accounting for paired blood/stool samples) were also geo-located with a hand-held GPS as part of contact tracing processes. Under the 2010 national guidelines, a typhoid outbreak in Fiji is defined as 2 or more suspected or confirmed cases of typhoid fever identified within 1 month in a new area/village. Laboratory-confirmed Salmonella Typhi isolates were sent to the Microbiological Diagnostic Unit Public Health Laboratory of the University of Melbourne, Australia, for genome sequencing. DNA was extracted from a single colony using a QIAsymphony DSP virus pathogen kit (Qiagen). Genome sequencing was performed using Nextera XT (Illumina) on the NextSeq 500/550 platform using 150 bp paired-end reads (Doherty Applied Microbial Genomics, Melbourne).

Complete genome sequence of endemic Fijian genotype 4.2.2
The complete genome of a Fijian endemic genotype 4.2.2 strain, ERL072973, was sequenced on an Illumina HiSeq2500 in rapid run mode, as described previously, 11 to produce two sets of paired-end short read data (ERR343279, ERR343374) with a read length of 100 bases. The sample was also sequenced on a PacBio RSII with 2 SMRT cells to produce two sets of long read data (ERR581097, ERR581107). A hybrid assembly was performed using Unicycler (v0.4.0) 17 with the long and short read datasets. Corrected reads were subsequently generated using the PacBio SMRT analysis pipeline (v2.3.0) for the long read datasets. The Unicycler assembly and the corrected reads were provided to circlator (v1.4.0) 18 Table 3). For core genome determination and tree building, mobile genetic elements and genomic regions of irregular SNP density were identified in the reference genome and the isolate core genome alignment using Gubbins v2.4.1. 21 All low complexity mapping regions, high SNP density regions and mobile genetic elements were then excised from the alignment, resulting in a 4,570,660 bp core genome alignment containing a total of 492 core genome SNPs, including 156 parsimony informative sites and 336 singleton sites.
To put the Fijian case-control genome sequencing within a global context, a global database of genome sequences was collated (Supplementary Table 4), which is an extension of the initial global framework 10 with the addition of published datasets from India, Nigeria, and Uganda. 22-24 To reconstruct the global Salmonella Typhi phylogeny, the Fiji sequence reads were mapped to the global CT18 reference genome (AL513382) using the same parameters as previously described. 10 SNPs in repeat regions, prophage regions, and plasmid sequences (~354 kb) were excluded from phylogenetic analyses. This global framework of 2,643 isolates (Supplementary Table 4) resulted in a set of 26,803 chromosomal SNPs (with respect to the CT18 reference genome) within an alignment of length 4,275,037 bp.

Phylogenetic and phylodynamic analyses
Phylogenetic relationships were inferred by both maximum-likelihood and Bayesian inference, using the single nucleotide polymorphism (SNP) alignment. Consensus SNP alignments were used to build a maximum-likelihood tree with IQ-TREE v1.6.11. 28 A general time-reversible model with a gamma distribution to describe among-site rate heterogeneity was selected (GTR+F+G) for analysis in IQ-TREE, and with 1000 ultra-fast nonparametric bootstrap replicates to assess topological uncertainty.
Molecular clock phylogenetic analysis was conducted with a Bayesian approach in BEAST v1.10 29 using a GTR+G substitution model with an uncorrelated lognormal relaxed clock model (UCLN). The UCLN model was selected over a strict clock (SC) model after inspecting the coefficient of rate variation (standard deviation of branch rates divided by their mean) in the UCLN model, an informal measure of clocklike behaviour that has been shown to have similar performance to methods based on marginal likelihoods. 30 However, we note that the estimates from both molecular clock models had overlapping posterior densities. We set the Skygrid tree prior, a semi-parametric model where population size is estimated at different coalescent intervals but large demographic changes are penalised. 31 To sample from the posterior distribution, a Markov chain Monte Carlo was run for 10⁸ iterations with sampling every 10⁴ iterations. The first 10% of steps from the chain were discarded as burn-in. Sufficient sampling from the stationary distribution was assessed by verifying that the effective sample size for key parameters was at least 200. Convergence was assessed by repeating the analyses and ensuring that the posterior samples matched.
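The chain settings and the clock-model diagnostic described above can be illustrated with a minimal Python sketch. The function names are hypothetical, the branch rates are illustrative values rather than study data, and the use of the population standard deviation is an assumption; BEAST reports this statistic directly.

```python
import statistics

def retained_samples(chain_length, sample_every, burnin_percent):
    """Number of posterior samples left after discarding burn-in."""
    total = chain_length // sample_every          # samples written to the log
    discarded = total * burnin_percent // 100     # burn-in, integer arithmetic
    return total - discarded

def coefficient_of_rate_variation(branch_rates):
    """Standard deviation of branch rates divided by their mean.

    Assumption: population standard deviation. Values near 0 indicate
    clocklike behaviour; larger values favour a relaxed clock.
    """
    return statistics.pstdev(branch_rates) / statistics.fmean(branch_rates)

# Chain settings stated in the text: 10^8 iterations, sampled every 10^4,
# first 10% discarded as burn-in.
print(retained_samples(10**8, 10**4, 10))  # 9000

# Illustrative (not study) branch rates in substitutions/site/year.
example_rates = [1.0e-7, 1.5e-7, 1.2e-7, 1.4e-7]
print(coefficient_of_rate_variation(example_rates))
```

With these settings, 9,000 samples remain per run, which is the pool from which effective sample sizes of at least 200 are assessed.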
To estimate the infected population size, we fit a constant exponential growth coalescent model. 32 Although this parametric model does not have the flexibility of skyline methods, it has explicit assumptions and the estimates of demographic parameters are straightforward to interpret. In particular, here we assume that we can approximate the average population trajectory with an exponential function and that the duration of infection is about 7 days. We estimated an infected population size at the time of collection of the most recent sample of around 2000 individuals (posterior mode = 2049.739; 95% credible interval: 1490.442 to 2777.179). Importantly, our estimate for this parameter was robust to the prior specified in the model (Supplementary Figure 6).

Whole genome clustering
To determine whole genome clusters, the filtered core SNP alignment of the 251 genomes was loaded into R and pairwise SNP distances were calculated using the dist.dna function in ape 33 (model="N"). Hierarchical clustering of the isolates was performed using the hclust function from the R 'stats' package, and clusters were defined with cutree using a threshold of 2 SNPs for cluster membership. The 2 SNP threshold was based on the SNP accumulation rate across the whole genome in our study (~0.6 SNPs per year). Clustering at a 0 SNP threshold was also performed to help link identical clones within clusters. Clusters were plotted if they contained more than 2 isolates at a maximum 2 SNP threshold. Genome clusters were also plotted against the phylogenetic tree using ggtree. 34
Early Career Researcher Award (DE190100805). GD was supported by an NIHR BRC AMR award and funding from UKRI Vaccine Hub, UKRI AMR, STRATAA and TyVac Gates. The funders had no role in study design, data analysis, data interpretation, writing of this report or decision to submit the paper for publication.
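The whole genome clustering procedure described in this section (pairwise SNP distances, hierarchical clustering, a 2 SNP cut, and reporting of clusters with more than 2 isolates) can be sketched in Python as follows. This is an illustration rather than the study's R code: it assumes complete linkage (the default method of hclust; the linkage actually used is not stated), counts only unambiguous A/C/G/T mismatches, and uses made-up toy sequences and hypothetical function names.

```python
from itertools import combinations

def snp_distance(a, b):
    """Count sites where two aligned sequences differ (A/C/G/T only)."""
    valid = set("ACGT")
    return sum(1 for x, y in zip(a, b) if x != y and x in valid and y in valid)

def complete_linkage_clusters(seqs, threshold=2):
    """Agglomerative clustering cut at `threshold` SNPs.

    Assumption: complete linkage, so every pair of isolates within a
    cluster differs by at most `threshold` SNPs.
    """
    names = list(seqs)
    dist = {frozenset(pair): snp_distance(seqs[pair[0]], seqs[pair[1]])
            for pair in combinations(names, 2)}
    clusters = [[name] for name in names]

    def cluster_dist(c1, c2):
        return max(dist[frozenset((p, q))] for p in c1 for q in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them if within the cut.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: cluster_dist(clusters[ij[0]], clusters[ij[1]]))
        if cluster_dist(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters

# Toy alignment (hypothetical isolates, not study data).
toy = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACGTACGTTT",
    "iso4": "TTTTACGAAC",
}
all_clusters = complete_linkage_clusters(toy, threshold=2)
# Report only clusters with more than 2 isolates, as in the text.
reported = [sorted(c) for c in all_clusters if len(c) > 2]
print(reported)  # [['iso1', 'iso2', 'iso3']]
```

Under complete linkage, the cut height is a maximum pairwise distance, which matches the description of clusters as groups of isolates related by at most 2 core SNPs; a different linkage method would give a looser definition.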

Dominance of non-H58 Salmonella Typhi genotypes in Fiji
In this study, Salmonella Typhi was isolated from blood and/or stool of patients with suspected typhoid in the Central Division of Fiji, and the disease was diagnosed using standard microbiology and Salmonella Typhi-specific Vi capsular antiserum. The genomes of 255 Salmonella Typhi isolates were initially analysed (Supplementary Table 1), and the patient demographics are shown in Table 1. The isolates were predominantly collected during a case-control study of typhoid fever in the Central Division of Fiji from 2012 to 2016. 16 The households of 129 cases were also geo-located with a hand-held global positioning system (GPS).
The genomes of the isolates were placed into a global context by mapping the data against a framework of 2,466 Salmonella Typhi isolates from 6 continents. 10,14 Within this extended framework, isolates from the Fijian case-control study clustered into three main ancestral genotypes (Figure 1A, purple ring). Intriguingly, genotype 4.2 accounted for 389 (99%) of 391 clinical isolates from Fiji, indicating that genotype 4.2 represents a geographically constrained genotype that is endemic to the Fijian islands. The remaining two genotype 4.2 isolates were from Tonga. 10 The three Fijian case-control study genotype 3.5 isolates clustered with isolates that are ancestral to genotype 3.5.4, which accounted for 105 (99%) of 106 Samoan isolates, 10 suggesting that some strains may circulate between Pacific Island Countries, even though the Pacific Countries appear to have their own genetically distinct genotypes (Supplementary Figure 2).

Low rates of multi-drug resistant Salmonella Typhi in Fiji
Given the global rise in ciprofloxacin and ceftriaxone resistance in Salmonella Typhi 11,14 we screened genome assemblies of Fijian isolates for the presence of resistance alleles. Two isolates carried mutations in DNA gyrase (gyrA (S83F)), which are linked with ciprofloxacin resistance. 36 These isolates were collected in 2014 and 2015 and resistance was phenotypically confirmed. All the isolates collected were positive for the typhoid toxin, 26 although re-analyses of the historical Fijian isolate database 10 identified one isolate that appeared to be toxin-negative. Unfortunately, the case files for this patient were not accessible.
Phylogenomic 'waves' of endemic sub-clones associate with population displacement
Figure 1B Figure 1D). Application of an exponential growth coalescent model to the dataset 32 allows for a crude estimation of the putative number of typhoid infections at the date of the last collected genome sample. We estimated that the infected population size at the time of the most recent sample was approximately 2,000 individuals (HPD: 1490 to 2777) in the Central Division of Fiji. Importantly, as this model assumes that population dynamics can be approximated by a deterministic exponential function, estimates of population size should be interpreted with caution. The overall rate of evolution of genotype 4.2.2 is 1.27 × 10⁻⁷ substitutions/site/year (HPD: 9.48 × 10⁻⁸ to 1.69 × 10⁻⁷), which is comparable to the published rate of 1.7 × 10⁻⁷ (CI: 1.1 × 10⁻⁷ to 2.2 × 10⁻⁷) for 4.3.1 (H58) 37 (Supplementary Figure 3).
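As a rough consistency check, the per-site rate above can be converted to a whole-genome SNP accumulation rate. The sketch below assumes that the 4,570,660 bp core genome alignment reported in the methods is the appropriate number of sites; pairing these two figures is an assumption made here, and the function name is hypothetical.

```python
def snps_per_genome_per_year(rate_per_site_per_year, alignment_length_bp):
    """Expected substitutions per genome per year for a given per-site rate."""
    return rate_per_site_per_year * alignment_length_bp

# Values taken from the text: genotype 4.2.2 rate and core alignment length.
rate = 1.27e-7          # substitutions/site/year
core_length = 4_570_660  # bp (assumed to be the relevant site count)

per_year = snps_per_genome_per_year(rate, core_length)
print(round(per_year, 2))  # 0.58
```

The result, about 0.58 SNPs per genome per year, agrees with the approximate figure of 0.6 SNPs per year used to justify the 2 SNP clustering threshold.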
Periods where increased genomic variation occurred were marked by outbreaks associated with cyclones (Figure 1D), and genomic epidemiology suggested that cases in the population displaced by cyclones resulted in spread of Salmonella Typhi clones into other geographical settings. Fiji was impacted by Cyclone Daman in 2007, Cyclone Tomas in 2010, Cyclone Evan in 2012, and Cyclone Winston in 2016. The cyclones damaged housing and/or produced rain depressions which inundated sanitation facilities, often affecting water supplies, an underlying risk factor resulting in increased Salmonella Typhi exposure in Fiji. 38,39 It is tempting to speculate that increased cyclonic intensity, driven by climate change, 40 is accelerating the evolution of Salmonella Typhi in Fiji.

Co-circulation and transmission dynamics of genomic clusters
To determine the phylogeographical context of the typhoid fever cases we examined, GPS coordinates were obtained for 128 (50.2%) of 255 typhoid isolates. To test the hypothesis that the two major subclades were geographically dispersed, maps were drawn for Fijian genotypes 4.2.1 and 4.2.2. Phylogeographical analyses suggest that there is no geographical structure to the two 4.2 sublineages; both are equally distributed (Figure 1C), and there is no apparent geographical restriction to the movement of these two clinically relevant genotypes within the Central Division of Fiji.
The phylogeographic study suggested that many of the diagnosed typhoid infections in our study resulted from common source outbreaks, with samples drawn from a particular location tending to be clustered in the phylogenetic tree. A total of 27 genomic clusters (defined by ≤ 2 core chromosomal SNPs, > 2 isolates) were identified, comprising 71% (177/251) of reported cases with a median of 8 cases per cluster (range 3 to 27 cases, IQR 5 to 18 cases) (Figure 2). The median duration of persistence of a cluster was 335 days (range 2 to 1,445 days, IQR 161 to 771 days). Many of these genomic clusters were not geographically constrained (Supplementary Figure 4), consistent with phases of long infection quiescence followed by both local and regional transmission. These data indicate that multiple typhoid clones circulate at any one time, some of which can persist for multiple years either within a human host or in an unsampled reservoir. Two major typhoid outbreaks were subjected to public health investigation by the Fiji Ministry of Health and Medical Services during the case-control study, one in Wailoku settlements associated with a commemorative gathering 41 and another at Qelekuro associated with Cyclone Winston. These two outbreaks were genetically unrelated, but demonstrate the spread of clones through different regional settings after communal gatherings and/or displacement through climatic events (Supplementary Figure 5). We conducted a detailed analysis of cases in Wailoku settlements over a 5-month period in 2014 (Figure 3). The reported outbreak from May to June involved 22 cases, of which 13 were laboratory confirmed and 9 were probable. 41 Further cases were diagnosed in the same settlement areas in August, September and October 2014.
Genomic analysis of the isolates obtained from the total of 19 Wailoku patients revealed fewer than 3 SNP differences, suggesting that the infections identified in August, September and October were unlikely to be exogenous reintroductions (Figure 3). Case workers determined that an individual infected in October was the child of a patient diagnosed in early June. One patient, a Fijian of Indian descent, was diagnosed on June 21, and genomics revealed that this patient was infected with the same outbreak clone. Genomic analysis of previous Salmonella Typhi isolates from Fiji suggests that the Wailoku isolate was present elsewhere in Fiji in 2013, and, therefore, that the clone was first introduced into the population on May 17, 2014 at a large memorial gathering. These findings support the use of bacterial genomics to identify transmission networks of typhoid fever in disease endemic settings like Fiji.

Discussion
Typhoid fever is endemic in some Pacific Island Countries [42][43][44] including Fiji. 4,5 In this study, we applied genomic epidemiological and Bayesian statistical approaches to shed new light on the population dynamics of Salmonella Typhi in the Central Division of Fiji. In contrast to genomic epidemiological investigations from typhoid endemic settings such as Asia and Africa, where the disease is dominated by the multi-drug resistant H58 clone of Salmonella Typhi, [9][10][11]14,16,[22][23][24][25] typhoid in Fiji is driven by a genetically distinct genotype (genotype 4.2 subclades). In further contrast to these extant disease endemic settings, very low levels (<1%) of Fijian typhoid isolates were resistant to antimicrobial agents, including ciprofloxacin, resistance to which is conferred by DNA gyrase polymorphisms. Such resistance appeared sporadic and has not become more widely established, perhaps due to strict control of ciprofloxacin use in Fiji. We also observe that Pacific Countries appear to have their own genetically distinct Salmonella Typhi genotypes, with infrequent transmission detected between them, yet confirming this requires broader surveillance networks.
Genomics cannot explain the incidence discrepancy between the two major Fijian ethnic groups, the iTaukei (or indigenous Fijians) and the Fijians of Indian Descent (FID), which show similar Vi seroprevalence. 6 This shared seroprevalence is in stark contrast to the reported incidence of typhoid fever in the two communities, where infections in FID are rarely reported. 6 Differences by ethnicity in the proportion of infections that are symptomatic might be explained by early presentation and syndromic treatment, access to treatment, or host genetics. In the case-control study, 16 there were 7 typhoid cases in FID; where the isolate was obtained, the bacteria were members of the same genotypes that were present in the iTaukei. One exception was the Salmonella Typhi isolate that was brought to Fiji by a FID who had travelled from India; this isolate was from the H58 clade. Typhoid fever in India, as in many countries, is now dominated by the H58 clade. 24 While previous global analyses revealed several Fijian importations of H58 Salmonella Typhi, 11 the H58 genotype is yet to displace the endemic Salmonella Typhi in Fiji, suggesting that targeted local intervention strategies may be useful in controlling endemic typhoid infection.
Figure 2. Timeline of Salmonella Typhi genomic clusters in Fiji. A) Each typhoid case is represented by a single dot and is classified as either a sporadic single genomic case (top box) or a member of one of 27 genomic clusters (defined as containing 3 or more isolates related by ≤2 core chromosomal SNPs). Cases are plotted by date of sampling (x-axis) and genomic cluster (y-axis), with the size of each cluster relative to the total number of isolates in the cluster. Dotted lines refer to exact clones (no SNP differences). Triangles refer to isolates collected before the case-control study, where the date of collection was estimated. B) Phylogenetic relationship of 251 Central Division isolates built from 252 SNPs and colour coded by the genomic clusters represented in (A). Multiple genomic outbreak clusters are represented at any one time, some of which can persist for several years and some of which can spread between geographical regions (Supplementary Figure 4).
The phylogenetic studies reveal the power of genomics to track outbreaks, both within villages and across regional settings. The genomics data suggest that, despite Salmonella Typhi having a monophyletic population structure, it is possible to definitively identify whether disease in a village is caused by single or multiple introductions. Through these investigations, multiple typhoid clones were found to circulate at any one time, some of which can persist for multiple years either within a human host or in an unsampled reservoir. Combined with anti-Vi antibody assays, which can also be used to help identify carriers, 6 genomics is an important tool in elucidating transmission pathways, identifying the causative genotypes in waves of localised outbreaks and informing intervention strategies.

Data sharing statement
Illumina sequence reads and draft genome assemblies were deposited into the European Nucleotide Archive (Bioproject identifier PRJNA739044). Accession numbers for individual sequence reads are supplied in Supplementary Table 1.

Editor note
The Lancet Group takes a neutral position with respect to territorial claims in published maps and institutional affiliations.

Declaration of interests
The authors declare no competing interests.